Terrible performance using XGBoost in H2O
I am training an XGBoost model using 5-fold cross-validation on a very imbalanced binary classification problem. The dataset has 1200 columns (multi-document word2vec document embeddings). The reported performance on the training data was extremely high (probably overfitting!). I know that H2O cross-validation trains an extra model on all of the available data and that different performance is to be expected. But what could cause the resulting model to perform so badly?
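To make the train versus cross-validation gap concrete, here is a minimal sketch in plain Python. It does not use H2O or XGBoost: as assumptions for illustration only, a 1-nearest-neighbour classifier stands in for a model that can memorize its training rows, and the data is synthetic noise with roughly 10% positives. All function names are hypothetical. It mimics the setup described above: one model fit on all rows and scored on those same rows, plus fold models scored only on rows they never saw.

```python
import random


def make_noise_data(n_rows, n_cols, positive_rate, seed):
    """Random features with labels that do not depend on them (imbalanced)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    y = [1 if rng.random() < positive_rate else 0 for _ in range(n_rows)]
    return X, y


def predict_1nn(train_X, train_y, row):
    """Return the label of the closest training row (squared Euclidean distance)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], row)),
    )
    return train_y[best]


def recall(y_true, y_pred):
    """Fraction of true positives that were predicted positive."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


def compare_train_and_cv(X, y, nfolds=5, seed=0):
    """Return (training recall of the full-data model, pooled holdout recall)."""
    n = len(X)
    # Final model: fit on all rows, then scored on those same rows.
    train_pred = [predict_1nn(X, y, row) for row in X]

    # Fold models: each row is predicted by a model that never saw it.
    order = list(range(n))
    random.Random(seed).shuffle(order)
    fold_of = {row_idx: pos % nfolds for pos, row_idx in enumerate(order)}
    cv_pred = [0] * n
    for fold in range(nfolds):
        keep = [i for i in range(n) if fold_of[i] != fold]
        fit_X = [X[i] for i in keep]
        fit_y = [y[i] for i in keep]
        for i in range(n):
            if fold_of[i] == fold:
                cv_pred[i] = predict_1nn(fit_X, fit_y, X[i])

    return recall(y, train_pred), recall(y, cv_pred)


X, y = make_noise_data(n_rows=200, n_cols=20, positive_rate=0.1, seed=42)
train_recall, cv_recall = compare_train_and_cv(X, y)
print(f"train recall: {train_recall:.2f}  cross-validated recall: {cv_recall:.2f}")
```

The memorizing model scores perfectly on the rows it was fit on, while its holdout recall is far lower, because the labels here carry no signal. This suggests that the number to trust is the cross-validated one rather than the training one, and that on imbalanced data a class-sensitive metric such as recall is likely more informative than accuracy.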